Testing whether r is statistically significantly different from zero
Before beginning your calculations for correlation coefficients, remember that the data used in a
correlation — the “ingredients” to a correlation — are the values of two variables referring to the
same experimental unit. An example would be measurements of height (X) and weight (Y) in a sample
of individuals. Because your raw data (the X and Y values) always have random fluctuations due to
either sampling error or measurement imprecision, a calculated correlation coefficient is also subject
to random fluctuations.
Even when X and Y are completely independent, your calculated r value is almost never exactly zero.
One way to test for a statistically significant association between X and Y is to test whether r is
statistically significantly different from zero by calculating a p value from the r value (see Chapter 3
for a refresher on p values).
The correlation coefficient has a strange sampling distribution, so it is not useful for statistical testing. Instead, the quantity t can be calculated from the observed correlation coefficient r, based on N observations, by the formula t = r√(N − 2)/√(1 − r²). Because t fluctuates in accordance with the Student t distribution with N − 2 degrees of freedom (df), it is useful for statistical testing (see Chapter 11 for more about t).
For example, if r = 0.500 for a sample of 12 participants, then t = 0.500√(12 − 2)/√(1 − 0.500²), which works out to t = 1.8257, with 10 degrees of freedom. You can use the online calculator at https://statpages.info/pdfs.html and calculate the p value by entering the t and df values. You can also do this in R by using the code:
2 * pt(q = 1.8257, df = 10, lower.tail = FALSE)
Either way, you get p = 0.098, which is greater than 0.05. At α = 0.05, the r value of 0.500 is not
statistically significantly different from zero (see Chapter 12 for more about α).
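As a cross-check on the arithmetic above, the t statistic can also be computed in a few lines of Python (a sketch using only the standard library; the function name t_from_r is just for illustration — the p value itself still comes from a t table or the R call shown above):

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for testing whether a correlation r, based on n pairs, differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r**2)

# The worked example: r = 0.500, N = 12
t = t_from_r(0.500, 12)
df = 12 - 2
print(round(t, 4), df)  # 1.8257 with 10 degrees of freedom
```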
How precise is an r value?
You can calculate confidence limits around an observed r value using a somewhat roundabout process.
The quantity z, calculated by the Fisher z transformation z = (1/2)ln[(1 + r)/(1 − r)], is approximately normally distributed with a standard deviation of 1/√(N − 3). Therefore, using the formulas for normal-based confidence intervals (see Chapter 10), you can calculate the lower and upper 95 percent confidence limits around z: z_lower = z − 1.96/√(N − 3) and z_upper = z + 1.96/√(N − 3). You can turn these into the corresponding confidence limits around r by the reverse of the z transformation: r = (e^(2z) − 1)/(e^(2z) + 1), evaluated for z = z_lower and z = z_upper.
Here are the steps for calculating 95 percent confidence limits around an observed r value of 0.500 for a sample of 12 participants (N = 12):
1. Calculate the Fisher z transformation of the observed r value:
z = (1/2)ln[(1 + 0.500)/(1 − 0.500)] = 0.5493
2. Calculate the lower and upper 95 percent confidence limits for z:
z_lower = 0.5493 − 1.96/√(12 − 3) = −0.1040 and z_upper = 0.5493 + 1.96/√(12 − 3) = 1.2026
3. Calculate the lower and upper 95 percent confidence limits for r:
r_lower = (e^(2 × (−0.1040)) − 1)/(e^(2 × (−0.1040)) + 1) = −0.104 and r_upper = (e^(2 × 1.2026) − 1)/(e^(2 × 1.2026) + 1) = 0.834
So the 95 percent confidence interval around r = 0.500 runs from −0.104 to 0.834. Because this interval includes zero, it agrees with the earlier test, which found that r was not statistically significantly different from zero.
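The three steps can be sketched in Python. Conveniently, math.atanh and math.tanh are exactly the Fisher z transformation and its inverse, so no hand-coded logarithms are needed (the function name r_confidence_interval is illustrative, not a standard-library function):

```python
from math import atanh, tanh, sqrt

def r_confidence_interval(r, n, z_crit=1.96):
    """95 percent confidence limits around an observed correlation r based on n pairs."""
    z = atanh(r)                  # Step 1: Fisher z, equal to (1/2)ln[(1 + r)/(1 - r)]
    se = 1 / sqrt(n - 3)          # standard deviation of z
    z_lo = z - z_crit * se        # Step 2: normal-based confidence limits for z
    z_hi = z + z_crit * se
    return tanh(z_lo), tanh(z_hi) # Step 3: reverse transformation back to the r scale

lo, hi = r_confidence_interval(0.500, 12)
print(round(lo, 3), round(hi, 3))  # -0.104 0.834
```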